deep learning system
Representing Beauty: Towards a Participatory but Objective Latent Aesthetics
What does it mean for a machine to recognize beauty? While beauty remains a culturally and experientially compelling but philosophically elusive concept, deep learning systems increasingly appear capable of modeling aesthetic judgment. In this paper, we explore the capacity of neural networks to represent beauty despite the immense formal diversity of objects to which the term applies. By drawing on recent work on cross-model representational convergence, we show how aesthetic content produces more similar and aligned representations between models which have been trained on distinct data and modalities - while unaesthetic images do not produce more aligned representations. This finding implies that the formal structure of beautiful images has a realist basis - rather than being only a reflection of socially constructed values. Furthermore, we propose that these realist representations exist because of a joint grounding of aesthetic form in physical and cultural substance. We argue that human perceptual and creative acts play a central role in shaping the latent spaces of deep learning systems, but that a realist basis for aesthetics shows that machines are not mere creative parrots but can produce novel creative insights from the unique vantage point of scale. Our findings suggest that human-machine co-creation is not merely possible, but foundational - with beauty serving as a teleological attractor in both cultural production and machine perception.
- Europe > Austria > Vienna (0.14)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Development of a Neural Network Model for Currency Detection to aid visually impaired people in Nigeria
Nwokoye, Sochukwuma, Moru, Desmond
Neural networks in assistive technology for the visually impaired leverage artificial intelligence's capacity to recognize patterns in complex data. They are used for converting visual data into auditory or tactile representations, helping the visually impaired understand their surroundings. The primary aim of this research is to explore the potential of artificial neural networks to help individuals with visual impairments differentiate between various forms of cash. In this study, we built a custom dataset of 3,468 images, which was subsequently used to train an SSD neural network model. The proposed system can accurately identify Nigerian cash, thereby streamlining commercial transactions. The performance of the system in terms of accuracy was assessed, and the Mean Average Precision score was over 90%. We believe that our system has the potential to make a substantial contribution to the field of assistive technology while also improving the quality of life of visually challenged persons in Nigeria and beyond.
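The Mean Average Precision figure quoted above can be made concrete with a small sketch. The abstract does not say which mAP variant was used, so this assumes the simplest uninterpolated form and assumes each detection has already been marked as a true or false positive (for example by an IoU threshold against the annotations). The function names and class labels are illustrative, not taken from the paper.

```python
def average_precision(is_true_positive, num_ground_truth):
    """Uninterpolated average precision for one class.

    is_true_positive: booleans for detections sorted by descending confidence.
    num_ground_truth: number of annotated objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, hit in enumerate(is_true_positive, start=1):
        if hit:
            hits += 1
            # Precision measured at the rank of each true positive.
            precision_sum += hits / rank
    return precision_sum / num_ground_truth


def mean_average_precision(per_class):
    """per_class maps a class label to (is_true_positive list, num_ground_truth)."""
    scores = [average_precision(flags, n) for flags, n in per_class.values()]
    return sum(scores) / len(scores)


# Hypothetical labels for two banknote classes.
example = {
    "note_a": ([True, False, True], 2),
    "note_b": ([False, True], 1),
}
print(mean_average_precision(example))
```

Detection benchmarks often use interpolated precision or average over several IoU thresholds, so a reported score depends on the protocol chosen.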
AXLearn: Modular Large Model Training on Heterogeneous Infrastructure
Lee, Mark, Gunter, Tom, Lan, Chang, Peebles, John, Zhou, Hanzhi, Zou, Kelvin, Bangalore, Sneha, Chiu, Chung-Cheng, Du, Nan, Du, Xianzhi, Dufter, Philipp, Hou, Ruixuan, Huang, Haoshuo, Hwang, Dongseong, Kong, Xiang, Lei, Jinhao, Lei, Tao, Li, Meng, Li, Li, Lu, Jiarui, Lu, Zhiyun, Ma, Yiping, Qiu, David, Rathod, Vivek, Tong, Senyu, Tu, Zhucheng, Wang, Jianyu, Wang, Yongqiang, Wang, Zirui, Weers, Floris, Wiseman, Sam, Yin, Guoli, Zhang, Bowen, Zhou, Xiyou, Zhuo, Danyang, Leong, Cheng, Pang, Ruoming
We design and implement AXLearn, a production deep learning system that facilitates scalable and high-performance training of large deep learning models. Compared to other state-of-the-art deep learning systems, AXLearn has a unique focus on modularity and support for heterogeneous hardware infrastructure. AXLearn's internal interfaces between software components follow strict encapsulation, allowing different components to be assembled to facilitate rapid model development and experimentation on heterogeneous compute infrastructure. We introduce a novel method of quantifying modularity via Lines-of-Code (LoC)-complexity, which demonstrates how our system maintains constant complexity as we scale the components in the system, compared to linear or quadratic complexity in other systems. This allows integrating features such as Rotary Position Embeddings (RoPE) into AXLearn across hundreds of modules with just 10 lines of code, compared to the hundreds of lines required in other systems. At the same time, AXLearn maintains equivalent performance compared to state-of-the-art training systems. Finally, we share our experience in the development and operation of AXLearn.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
- North America > United States > California > San Diego County > Carlsbad (0.04)
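For readers unfamiliar with the Rotary Position Embeddings (RoPE) mentioned in the AXLearn abstract, the following is a minimal pure-Python sketch of the rotation itself. It is not AXLearn code and says nothing about how AXLearn integrates the feature; the function names, the interleaved pairing of dimensions, and the default base of 10000 are assumptions chosen for illustration.

```python
import math


def apply_rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even dimension")
    out = []
    for i in range(0, d, 2):
        # Pair k = i / 2 uses frequency base ** (-2k / d) = base ** (-i / d).
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.9]
# The attention score depends only on the relative offset (here 2).
print(dot(apply_rope(q, 5), apply_rope(k, 3)))
print(dot(apply_rope(q, 7), apply_rope(k, 5)))
```

The two printed scores agree up to floating-point error, which is the property that makes the rotation a relative position encoding.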
QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach
Dong, Shouyang, Wen, Yuanbo, Bi, Jun, Huang, Di, Guo, Jiaming, Xu, Jianxing, Xu, Ruibai, Song, Xinkai, Hao, Yifan, Zhou, Xuehai, Chen, Tianshi, Guo, Qi, Chen, Yunji
Heterogeneous deep learning systems (DLS) such as GPUs and ASICs have been widely deployed in industrial data centers, which requires developing multiple low-level tensor programs for different platforms. An attractive solution to relieve the programming burden is to transcompile the legacy code of one platform to others. However, current transcompilation techniques suffer from either tremendous manual effort or functional incorrectness, leaving "Write Once, Run Anywhere" for tensor programs an open question. We propose a novel transcompiler, i.e., QiMeng-Xpiler, for automatically translating tensor programs across DLS via both large language models (LLMs) and symbolic program synthesis, i.e., neural-symbolic synthesis. The key insight is to leverage the powerful code generation ability of LLMs to make costly search-based symbolic synthesis computationally tractable. Concretely, we propose multiple LLM-assisted compilation passes via pre-defined meta-prompts for program transformation. During each program transformation, efficient symbolic program synthesis is employed to repair incorrect code snippets at a limited scale. To attain high performance, we propose a hierarchical auto-tuning approach to systematically explore both the parameters and sequences of transformation passes. Experiments on 4 DLS with distinct programming interfaces, i.e., Intel DL Boost with VNNI, NVIDIA GPU with CUDA, AMD MI with HIP, and Cambricon MLU with BANG, demonstrate that QiMeng-Xpiler correctly translates different tensor programs with an accuracy of 95% on average, and the performance of translated programs achieves up to 2.0x over vendor-provided manually-optimized libraries. As a result, the programming productivity of DLS is improved by up to 96.0x via transcompiling legacy tensor programs.
- Europe > Austria > Vienna (0.14)
- Asia > China (0.04)
- North America > United States (0.04)
- Europe > Germany (0.04)
Less is More: Efficient Weight Farcasting with 1-Layer Neural Network
Shou, Xiao, Bhattacharjya, Debarun, Ding, Yanna, Zhao, Chen, Li, Rui, Gao, Jianxi
Addressing the computational challenges inherent in training large-scale deep neural networks remains a critical endeavor in contemporary machine learning research. While previous efforts have focused on enhancing training efficiency through techniques such as gradient descent with momentum, learning rate scheduling, and weight regularization, the demand for further innovation continues to burgeon as model sizes keep expanding. In this study, we introduce a novel framework which diverges from conventional approaches by leveraging long-term time series forecasting techniques. Our method capitalizes solely on initial and final weight values, offering a streamlined alternative for complex model architectures. We also introduce a novel regularizer that is tailored to enhance the forecasting performance of our approach. Empirical evaluations conducted on synthetic weight sequences and real-world deep learning architectures, including the prominent large language model DistilBERT, demonstrate the superiority of our method in terms of forecasting accuracy and computational efficiency. Notably, our framework showcases improved performance while requiring minimal additional computational overhead, thus presenting a promising avenue for accelerating the training process across diverse tasks and architectures.
- North America > United States > Texas > McLennan County > Waco (0.04)
- North America > United States > New York > Rensselaer County > Troy (0.04)
Implementing An Artificial Quantum Perceptron
Hathidara, Ashutosh, Pandey, Lalit
A perceptron is a fundamental building block of a neural network. The flexibility and scalability of the perceptron make it ubiquitous in building intelligent systems. Studies have shown the efficacy of a single neuron in making intelligent decisions. Here, we examined and compared two perceptrons with distinct mechanisms, and developed a quantum version of one of those perceptrons. As a part of this modeling, we implemented the quantum circuit for an artificial perceptron, generated a dataset, and simulated the training. Through these experiments, we show that there is an exponential growth advantage and test different qubit versions. Our findings show that this quantum model of an individual perceptron can be used as a pattern classifier. For the second type of model, we provide an understanding of how to design and simulate a spike-dependent quantum perceptron. Our code is available at https://github.com/ashutosh1919/quantum-perceptron
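As background for the abstract above, here is a minimal sketch of a classical perceptron used as a pattern classifier. It is deliberately the classical learning rule, not the quantum circuit the authors implement, and it is not taken from their repository; the function names and the toy AND dataset are assumptions made for illustration.

```python
def predict(weights, bias, x):
    """Threshold unit: fire (1) when the weighted sum is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Classical perceptron learning rule on (input, target) pairs."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                # Move the decision boundary towards the misclassified point.
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
                errors += 1
        if errors == 0:
            break  # every sample is classified correctly
    return weights, bias


# Toy linearly separable pattern: logical AND.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(AND_DATA)
print([predict(w, b, x) for x, _ in AND_DATA])
```

A single unit of this kind can only separate linearly separable patterns, which is one reason alternative perceptron models are of interest.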
Reviews: Hybrid Reward Architecture for Reinforcement Learning
R5: Summary: This paper builds on the basic idea of the Horde architecture: learning many value functions in parallel with off-policy reinforcement learning. This paper shows that learning many value functions in parallel improves the performance on a single main task. The novelty here lies in a particular strategy for generating many different reward functions and how to combine them to generate behavior. The results show large improvements in performance in an illustrative grid world and Ms. Pac-Man. Decision: This paper is difficult to access.
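The "combine them to generate behavior" step in the review can be sketched as follows. The review does not spell out the combination rule, so this assumes the simplest choice: each value head scores every action for its own reward component, the scores are summed per action, and the agent acts greedily on the total. The function names and numbers are illustrative.

```python
def aggregate_q_values(head_q_values):
    """Sum per-action Q-values across heads, one head per reward component."""
    num_actions = len(head_q_values[0])
    return [sum(head[a] for head in head_q_values) for a in range(num_actions)]


def greedy_action(head_q_values):
    """Pick the action with the highest aggregated value."""
    total = aggregate_q_values(head_q_values)
    return max(range(len(total)), key=total.__getitem__)


# Three hypothetical heads scoring three actions.
heads = [
    [1.0, 0.0, 0.5],
    [0.0, 2.0, 0.5],
    [0.2, 0.1, 3.0],
]
print(aggregate_q_values(heads), greedy_action(heads))
```

Other combination rules, such as weighted sums, fit the same interface by changing only `aggregate_q_values`.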
Reviews: Toddler-Inspired Visual Object Learning
The goal of the paper is to "data mine" records of toddlers' and their mothers' fixations while playing with a set of 24 toys in order to observe what might be good training data for a deep network, given a fixed training budget. The idea is that the toddler is the best visual learning system we know, and so the data that toddlers learn from should give us a clue about what data is appropriate for deep learning. They take fixation records extracted from toddlers (16-24 mo old) and their mothers collected via scene cameras and eye tracking to examine the data distribution of infants' visual input or mothers' visual input. This study clearly falls under the cognitive science umbrella at NIPS, although they try to make it about deep learning. For example, if they only cared about deep learning, they would not use a retinal filter. First, they manually collect data recording what toys the infants and mothers are fixating on (ignoring other fixations).
Contexts Matter: An Empirical Study on Contextual Influence in Fairness Testing for Deep Learning Systems
Background: Fairness testing for deep learning systems has become increasingly important. However, much work assumes perfect context and conditions from the other parts: well-tuned hyperparameters for accuracy, rectified bias in data, and mitigated bias in the labeling. Yet, these are often difficult to achieve in practice due to their resource-/labour-intensive nature. Aims: In this paper, we aim to understand how varying contexts affect fairness testing outcomes. Method: We conduct an extensive empirical study, which covers 10,800 cases, to investigate how contexts can change the fairness testing result at the model level against the existing assumptions. We also study why the outcomes were observed through the lens of correlation/fitness landscape analysis. Results: Our results show that different context types and settings generally lead to a significant impact on the testing, which is mainly caused by the shifts of the fitness landscape under varying contexts. Conclusions: Our findings provide key insights for practitioners to evaluate the test generators and hint at future research directions.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Information Technology (0.67)
- Banking & Finance (0.67)
- Education > Educational Setting (0.47)
VALUED -- Vision and Logical Understanding Evaluation Dataset
Saha, Soumadeep, Saha, Saptarshi, Garain, Utpal
Starting with early successes in computer vision tasks, deep learning based techniques have since overtaken state of the art approaches in a multitude of domains. However, it has been demonstrated time and again that these techniques fail to capture semantic context and logical constraints, instead often relying on spurious correlations to arrive at the answer. Since the application of deep learning techniques to critical scenarios is dependent on adherence to domain specific constraints, several attempts have been made to address this issue. One limitation holding back a thorough exploration of this area is a lack of suitable datasets which feature a rich set of rules. In order to address this, we present the VALUE (Vision And Logical Understanding Evaluation) Dataset, consisting of 200,000+ annotated images and an associated rule set, based on the popular board game - chess. The curated rule set considerably constrains the set of allowable predictions, and is designed to probe key semantic abilities like localization and enumeration. Alongside standard metrics, additional metrics to measure performance with regard to logical consistency are presented. We analyze several popular and state of the art vision models on this task, and show that, although their performance on standard metrics is laudable, they produce a plethora of incoherent results, indicating that this dataset presents a significant challenge for future work.
- Asia > India > West Bengal > Kolkata (0.04)
- Europe > Germany > Berlin (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Health & Medicine (1.00)
- Leisure & Entertainment > Games > Chess (0.70)
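To illustrate what a logical-consistency check on a predicted chess position might look like, here is a minimal sketch. The VALUED abstract does not list its rule set, so the rules below are ordinary chess constraints chosen for illustration (one king per side, at most eight pawns and sixteen pieces per side, no pawns on the first or last rank), and the board encoding with FEN-style piece letters is likewise an assumption.

```python
def rule_violations(board):
    """Return codes for the basic chess constraints that `board` breaks.

    board: 8 rows of 8 cells, each a FEN-style piece letter or "." for empty
    (uppercase = white, lowercase = black).
    """
    pieces = [p for row in board for p in row if p != "."]
    violations = []
    for king in ("K", "k"):
        if pieces.count(king) != 1:
            violations.append(f"king_count:{king}")
    for pawn in ("P", "p"):
        if pieces.count(pawn) > 8:
            violations.append(f"pawn_count:{pawn}")
    # Pawns can never stand on the first or last rank.
    if any(p in ("P", "p") for p in board[0] + board[7]):
        violations.append("pawn_on_back_rank")
    if sum(p.isupper() for p in pieces) > 16:
        violations.append("piece_count:white")
    if sum(p.islower() for p in pieces) > 16:
        violations.append("piece_count:black")
    return violations


START = [list(r) for r in (
    "rnbqkbnr", "pppppppp", "........", "........",
    "........", "........", "PPPPPPPP", "RNBQKBNR",
)]

# A prediction that hallucinates a second white king.
bad = [row[:] for row in START]
bad[4][4] = "K"
print(rule_violations(START), rule_violations(bad))
```

A consistency metric could then report the fraction of predicted boards with no violations alongside standard accuracy.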